Robust Efficient Conditional Probability Estimation

Author

  • John Langford
Abstract

The problem is to find a general, robust, and efficient mechanism for estimating a conditional probability P(y|x), where robustness and efficiency are measured using techniques from learning reductions.

In particular, suppose we have access to a binary regression oracle B with two interfaces: one for specifying training information and one for testing. Training information is specified as B(x′, y′), where x′ is an unspecified feature vector and y′ ∈ [0, 1] is a bounded scalar; no value is returned. This operation is stateful, possibly altering the return value of the testing interface in arbitrary ways. Testing is done via B(x′), which returns a value in [0, 1]. The testing operation is stateless.

A learning reduction consists of two algorithms, R and R⁻¹. The algorithm R takes as input a single example (x, y), where x is a feature vector and y ∈ {1, ..., k} is a discrete variable. R then specifies a training example (x′, y′) for the oracle B. R can then create another training example for B based on all available information. This process repeats some finite number of times before halting without returning information.

A basic observation is that, for any oracle algorithm, a distribution D(x, y) over multiclass examples together with a reduction R induces a distribution over sequences (x′, y′)* of oracle examples. We collapse this into a distribution D′(x′, y′) over oracle examples by drawing uniformly from the sequence.

The algorithm R⁻¹ takes as input a single example (x, y) and returns a value v ∈ [0, 1] after using (only) the testing interface of B zero or more times.

We measure the power of an oracle and a reduction by squared-loss regret:

$$\mathrm{reg}(D, R^{-1}) = E_{(x,y)\sim D}\left[\left(R^{-1}(x, y) - D(y \mid x)\right)^2\right]$$

and similarly, letting $\mu_{x'} = E_{(x',y')\sim D'}[\,y' \mid x'\,]$,

$$\mathrm{reg}(D', B) = E_{(x',y')\sim D'}\left[\left(B(x') - \mu_{x'}\right)^2\right]$$

The open problem is to specify R and R⁻¹ satisfying the following theorem:
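The oracle interface and the pair (R, R⁻¹) described in the abstract can be made concrete with a small sketch. Everything named here is an assumption for illustration: `AveragingOracle` is a toy stand-in for B that averages the labels seen for each x′, and the reduction shown is a plain one-against-all encoding that issues k oracle examples per multiclass example. It is meant only to show the shape of the interfaces, not a solution to the open problem.

```python
from collections import defaultdict


class AveragingOracle:
    """Toy stand-in for the binary regression oracle B (hypothetical).

    train(x', y') is stateful and returns nothing.
    predict(x') is stateless and returns a value in [0, 1].
    """

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def train(self, x_prime, y_prime):
        # Training interface B(x', y'): y' must lie in [0, 1].
        if not 0.0 <= y_prime <= 1.0:
            raise ValueError("y' must be in [0, 1]")
        self._sum[x_prime] += y_prime
        self._count[x_prime] += 1

    def predict(self, x_prime):
        # Testing interface B(x'): reads state but never modifies it.
        n = self._count.get(x_prime, 0)
        if n == 0:
            return 0.5  # arbitrary default for unseen x' (assumption)
        return self._sum[x_prime] / n


def reduce_train(oracle, x, y, k):
    """R: turn one multiclass example (x, y) into k oracle examples."""
    for label in range(1, k + 1):
        oracle.train((x, label), 1.0 if label == y else 0.0)


def reduce_predict(oracle, x, y):
    """R^{-1}: estimate D(y|x) using only the testing interface."""
    return oracle.predict((x, y))


# Usage: four multiclass examples sharing the same feature value "a".
oracle = AveragingOracle()
k = 3
for x, y in [("a", 1), ("a", 1), ("a", 2), ("a", 3)]:
    reduce_train(oracle, x, y, k)

estimates = {y: reduce_predict(oracle, "a", y) for y in range(1, k + 1)}
print(estimates)
```

In this sketch x must be hashable because the toy oracle stores per-input averages in a dictionary; a real regression oracle would generalize across feature vectors instead.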

Similar resources

An efficient model-free estimation of multiclass conditional probability

Conventional multiclass conditional probability estimation methods, such as Fisher’s discriminant analysis and logistic regression, often require restrictive distributional model assumptions. In this paper, a model-free estimation method is proposed to estimate multiclass conditional probability through a series of conditional quantile regression functions. Specifically, the conditional class pr...

6. Conclusions

as the level of inputs correlation changes, but also the values of total power consumption. For example, for duke2, the total power estimated under weakly correlated inputs was 3611.67 uW, while this value for strongly correlated inputs was 820.87 uW (there is a factor of 4 between the two). The same behavior has been observed for other circuits. To conclude, input pattern dependencies (in parti...

Doubly Robust and Locally Efficient Estimation with Missing Outcomes

We consider parametric regression where the outcome is subject to missingness. To achieve the semiparametric efficiency bound, most existing estimation methods require the correct modeling of certain second moments of the data, which can be very challenging in practice. We propose an estimation procedure based on the conditional empirical likelihood (CEL) method. Our method does not require us ...

On doubly robust estimation in a semiparametric odds ratio model.

We consider the doubly robust estimation of the parameters in a semiparametric conditional odds ratio model. Our estimators are consistent and asymptotically normal in a union model that assumes either of two variation independent baseline functions is correctly modelled but not necessarily both. Furthermore, when either outcome has finite support, our estimators are semiparametric efficient in...

Efficient Simulation of a Random Knockout Tournament

We consider the problem of using simulation to efficiently estimate the win probabilities for participants in a general random knockout tournament. Both of our proposed estimators, one based on the notion of “observed survivals” and the other based on conditional expectation and post-stratification, are highly effective in terms of variance reduction when compared to the raw simulation estimato...

The Regularization Aspect of Optimal-Robust Conditional Value-at-Risk Portfolios

In portfolio management, Robust Conditional Value at Risk (Robust CVaR) has been proposed to deal with structured uncertainty in the estimation of the assets' probability distribution. Meanwhile, regularization in portfolio optimization has been investigated as a way to construct portfolios that show satisfactory out-of-sample performance under estimation error. In this paper, we prove that optim...


Publication date: 2010